Regionalisation

In this note, we test the hierarchical classification algorithm with contiguity constraint described in Guénard and Legendre (2022) and apply it to FAO food balance sheets measured in Kcal. The data relate to the year 2022, but time series going back to 1961 can be studied under certain conditions (changes in country borders, evolution of the FAO nomenclature, etc.). It is also possible to change the measurement criterion, the proximity graph, and so on.

Data

Geometry

We select the countries of the world for which FAO data are available in 2022, which removes Western Sahara and North Korea, for example. We keep small island states, which obliges us to choose a weight matrix different from the classical measure based on border contiguity.

Geopolitical network

As discussed in another part of the WorldRegio project (see https://worldregio.github.io/world_geom/geopolitical_network.html), there are many ways to build a network of proximity between states (i.e. a weight matrix that will be used as a constraint in the clustering algorithm).

Here we choose a Voronoï-Delaunay triangulation computed on a map drawn in a north polar projection.

All states are connected, sometimes over long distances, as in the links between Canada and Korea or between French Polynesia and Chile.

This solution is certainly open to criticism because it depends on the projection, on the point chosen as the center of each state, on the decision to keep or eliminate small states, etc. But our purpose here is mainly pedagogical, and this is not the place to discuss the choice of the best geopolitical network. Just keep in mind that this choice has important implications for the results.
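To make the Voronoï-Delaunay construction mentioned above more concrete, here is a minimal Python sketch. It is not the code used for this note: it assumes the state centers have already been projected to planar (x, y) coordinates, it uses a brute-force "empty circumcircle" test that is only practical for small point sets, and the names `circumcircle`, `delaunay_links` and `demo_points` are hypothetical.

```python
from itertools import combinations


def circumcircle(a, b, c):
    """Return (ux, uy, r2) for the circle through a, b, c, or None if collinear."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    r2 = (ax - ux) ** 2 + (ay - uy) ** 2
    return ux, uy, r2


def delaunay_links(points):
    """Return the sorted list of (i, j) links of the Delaunay triangulation.

    A triangle is kept when no other point lies strictly inside its
    circumcircle; each kept triangle contributes its three sides as links.
    """
    links = set()
    n = len(points)
    for i, j, k in combinations(range(n), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        ux, uy, r2 = circle
        empty = all(
            (points[m][0] - ux) ** 2 + (points[m][1] - uy) ** 2 >= r2 - 1e-9
            for m in range(n)
            if m not in (i, j, k)
        )
        if empty:
            links.update({(i, j), (i, k), (j, k)})
    return sorted(links)


# Four hypothetical state centers (already projected).
demo_points = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0), (1.0, -2.0)]
print(delaunay_links(demo_points))
```

In this toy example, points 2 and 3 are not linked because the segment between them is not a side of any triangle with an empty circumcircle. Moving or removing a single point can change which pairs are linked, which is why the result depends on the projection and on the set of states retained.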

Dissimilarity matrix

We begin directly with the case of a dissimilarity matrix using all 95 items proposed by the FAO to describe a country’s diet. The table is of the following form. The unit of measurement is Kcal per person per day.

Raw data (Kcal per person per day, by FAO item code)
2511 2513 2514 2515 2516 2517 2518 2520 2531 2532 2533 2534 2535 2536 2537 2541 2542 2543 2546 2547 2549 2551 2552 2555 2557 2558 2559 2560 2561 2563 2570 2571 2572 2573 2574 2575 2576 2577 2578 2579 2580 2581 2582 2586 2601 2602 2605 2611 2612 2613 2614 2615 2616 2617 2618 2619 2620 2625 2630 2633 2635 2640 2641 2642 2645 2655 2656 2657 2658 2680 2731 2732 2733 2734 2735 2736 2737 2740 2743 2744 2745 2761 2762 2763 2764 2765 2766 2767 2769 2775 2781 2782 2807 2848 2899
AFG 1317.18 0.52 45.63 1.39 0.06 0.60 0.06 5.74 44.33 0.02 0.00 0.00 0.00 0.00 0 0 83.18 1.51 0.00 16.99 9.83 21.05 4.28 0.05 0.00 0.60 0.0 0.04 10.88 0.01 0.25 2.58 0.00 24.55 0.29 3.81 0.07 152.66 0.00 1.46 0.56 0 0.16 30.26 0.00 6.60 33.40 2.07 0.01 0.00 1.07 4.25 0.00 9.28 0.01 4.31 23.87 11.49 0.02 0.86 0.16 0.09 0.00 0.03 3.35 0.00 0.00 0.00 0.00 0.25 11.22 18.90 0.03 3.29 1.05 3.21 8.26 21.14 0.03 2.47 0.30 0.69 0.00 0.12 0.01 0.00 0.00 0.00 0.00 0 0 0.00 202.34 85.27 3.42
AGO 138.23 0.00 409.17 0.00 0.44 6.57 0.00 0.19 13.90 748.76 100.23 0.01 0.00 0.00 0 0 81.73 8.57 53.75 0.00 6.86 1.06 38.49 6.04 0.00 0.08 0.0 0.26 1.30 0.15 0.00 72.25 21.37 5.68 0.08 0.25 5.53 124.92 0.08 0.00 2.53 0 0.01 17.06 9.26 0.14 26.96 0.18 0.02 0.00 19.95 68.73 0.00 0.27 14.73 0.03 0.05 28.25 0.80 1.91 0.02 0.03 0.02 0.00 0.18 2.66 40.25 27.01 4.71 1.34 18.96 2.81 54.30 39.62 1.07 7.77 7.39 0.78 0.54 6.28 5.63 2.45 3.34 17.96 1.58 0.08 0.01 0.00 0.00 0 0 0.00 129.37 17.64 4.19
ALB 854.36 1.92 21.71 5.00 1.53 0.00 0.00 0.71 88.70 0.01 0.05 0.00 0.00 0.00 0 0 189.24 39.97 44.64 0.02 4.41 54.97 6.05 0.03 0.00 0.66 0.0 0.91 0.00 37.13 0.00 6.15 1.25 91.46 0.06 0.00 0.00 0.00 0.00 0.00 119.67 0 6.00 27.94 53.67 37.60 160.22 15.64 3.25 0.04 0.19 16.83 0.00 47.30 0.17 21.17 131.89 83.24 2.16 29.69 0.22 0.23 2.06 0.00 0.16 5.41 44.41 0.02 9.80 1.61 76.02 46.59 35.57 85.67 1.98 18.07 51.30 22.76 1.69 59.16 16.25 3.30 1.49 7.40 0.63 1.53 1.90 0.25 0.00 0 0 0.00 57.52 609.73 20.12
ARE 860.55 0.03 24.84 0.03 1.02 0.00 0.00 0.49 59.22 2.54 2.02 0.45 0.68 0.08 0 0 207.81 18.13 28.29 28.12 113.04 39.88 9.26 0.86 2.56 0.94 9.8 9.13 12.62 5.93 0.21 50.78 3.59 21.71 124.82 0.58 1.87 317.36 13.85 1.45 34.97 0 31.31 18.98 4.78 21.64 56.75 12.22 3.91 0.39 0.76 21.22 3.94 19.39 1.93 44.38 6.33 21.03 0.61 24.14 0.08 2.68 8.64 0.00 12.83 4.10 6.80 0.11 0.00 3.74 58.67 44.37 4.25 208.40 12.10 7.93 17.11 48.79 0.00 41.10 8.59 11.54 8.44 23.39 0.81 1.17 1.64 0.31 0.00 0 0 0.00 310.26 156.85 13.49
ARG 876.56 0.00 93.00 0.00 7.55 0.00 0.00 3.33 64.34 5.65 6.49 0.00 0.00 0.00 0 0 309.25 61.68 36.14 27.18 18.95 5.53 25.99 0.02 0.00 0.00 0.0 1.18 0.00 1.99 0.00 66.60 21.04 287.14 1.32 1.19 0.00 0.00 1.09 0.02 5.52 0 30.02 14.43 15.72 11.83 28.87 23.83 7.73 0.82 0.04 23.86 0.11 5.89 0.65 0.06 5.95 16.92 2.06 12.41 8.22 0.33 0.90 0.01 1.10 40.88 50.83 0.23 12.21 0.00 257.46 5.67 107.37 181.84 4.39 34.52 104.41 11.27 0.57 55.44 0.29 0.82 3.52 3.85 0.00 2.12 0.58 0.64 0.00 0 0 0.01 69.92 258.03 0.00
ARM 928.54 56.05 43.46 3.85 11.07 0.00 0.00 32.76 122.83 0.00 0.01 0.00 0.00 0.00 0 0 259.17 40.41 7.47 14.32 10.29 26.30 10.69 0.16 0.00 0.17 0.0 0.78 1.42 4.66 0.85 3.62 0.03 199.53 0.02 0.00 0.00 0.00 2.45 0.20 1.55 0 3.26 47.26 27.71 10.75 110.98 11.19 1.38 0.60 0.05 15.30 0.00 19.35 0.76 2.31 18.17 93.77 6.24 59.00 0.05 0.71 1.17 0.00 0.36 3.79 14.11 0.00 5.57 1.82 138.20 23.54 68.92 70.07 0.20 21.92 135.47 11.82 22.88 50.18 6.86 7.35 0.09 2.53 0.41 0.52 0.02 0.03 0.00 0 0 0.00 39.93 414.62 22.91
ATG 494.44 0.09 13.87 0.01 12.60 0.00 0.00 2.42 40.46 4.47 4.55 3.18 4.35 0.00 0 0 180.99 30.88 9.08 3.47 10.78 2.72 11.11 0.71 0.85 3.84 0.0 7.13 1.03 0.94 0.65 90.49 1.85 0.97 10.80 0.04 0.00 0.00 1.59 1.47 22.02 0 2.86 106.71 4.08 5.98 44.94 24.35 4.54 2.12 3.23 21.00 8.59 8.44 3.52 0.87 5.96 130.80 7.94 24.48 0.09 3.65 1.68 0.60 12.50 24.09 30.86 7.51 60.73 7.88 47.24 6.67 63.59 258.92 1.18 9.52 86.81 34.79 18.93 35.42 2.95 2.32 5.35 27.14 50.75 6.33 0.84 3.35 0.00 0 0 1.44 94.06 148.51 18.48
AUS 542.77 0.00 28.54 0.19 0.51 0.00 0.06 12.89 86.74 0.82 6.61 0.30 0.03 0.00 0 0 345.02 73.11 32.89 8.19 27.47 84.12 28.37 1.55 0.00 0.49 0.0 8.29 0.00 2.80 0.01 11.07 1.31 14.10 279.43 47.51 0.00 49.83 1.97 4.68 48.22 0 1.74 143.35 12.25 10.31 54.25 7.83 0.76 0.15 0.95 17.44 0.00 16.71 4.33 1.31 9.12 45.57 7.11 25.13 0.22 1.10 1.83 0.05 3.64 65.59 106.05 2.17 51.66 0.00 96.62 58.57 108.68 185.62 3.00 27.04 50.41 74.21 1.83 24.83 4.53 8.59 5.30 12.82 5.23 4.17 2.48 1.09 0.08 0 0 0.10 83.69 330.74 40.55
AUT 677.19 2.44 83.33 66.75 20.51 0.00 0.00 15.01 82.13 0.00 1.19 0.06 0.00 0.00 0 0 330.64 42.62 1.49 1.93 12.48 37.55 19.76 40.37 12.74 0.31 0.0 2.92 2.06 2.35 2.52 62.14 1.62 130.26 208.37 0.00 0.00 0.00 12.92 0.36 25.74 0 0.00 218.46 14.99 10.20 56.01 13.93 2.78 0.36 1.04 30.74 0.02 35.77 4.42 1.35 8.04 44.05 3.31 13.71 0.15 1.91 2.85 0.05 5.86 59.42 127.72 0.23 21.34 0.06 55.04 5.89 199.15 71.88 1.57 13.49 228.94 114.25 48.67 52.92 8.83 7.46 4.44 11.35 1.41 1.53 0.53 0.23 0.02 0 0 0.01 36.68 369.33 0.00
AZE 1247.15 39.55 76.99 1.67 1.63 0.00 0.00 11.50 158.27 0.00 0.00 0.00 0.00 0.00 0 0 250.58 37.72 6.20 6.83 6.83 51.13 10.29 0.02 0.00 0.09 0.0 0.85 0.00 0.57 0.00 4.37 0.00 77.84 2.78 0.00 0.00 0.05 1.45 0.01 3.11 0 29.45 154.75 36.55 25.12 66.36 7.14 1.92 0.08 0.62 9.07 0.00 21.60 0.25 3.90 31.79 54.65 1.93 31.81 0.50 0.30 0.16 0.00 1.13 1.76 5.95 0.06 86.01 1.93 83.56 52.23 5.38 60.67 0.01 15.90 65.91 82.57 0.04 34.42 5.45 1.36 0.18 2.45 0.32 0.04 0.02 0.01 0.00 0 0 0.00 52.29 283.07 9.38
Row profiles (% of each country's total)
2511 2513 2514 2515 2516 2517 2518 2520 2531 2532 2533 2534 2535 2536 2537 2541 2542 2543 2546 2547 2549 2551 2552 2555 2557 2558 2559 2560 2561 2563 2570 2571 2572 2573 2574 2575 2576 2577 2578 2579 2580 2581 2582 2586 2601 2602 2605 2611 2612 2613 2614 2615 2616 2617 2618 2619 2620 2625 2630 2633 2635 2640 2641 2642 2645 2655 2656 2657 2658 2680 2731 2732 2733 2734 2735 2736 2737 2740 2743 2744 2745 2761 2762 2763 2764 2765 2766 2767 2769 2775 2781 2782 2807 2848 2899
AFG 58.71 0.02 2.03 0.06 0.00 0.03 0 0.26 1.98 0.00 0.00 0.00 0.00 0 0 0 3.71 0.07 0.00 0.76 0.44 0.94 0.19 0.00 0.00 0.03 0.0 0.00 0.48 0.00 0.01 0.12 0.00 1.09 0.01 0.17 0.00 6.80 0.00 0.07 0.02 0 0.01 1.35 0.00 0.29 1.49 0.09 0.00 0.00 0.05 0.19 0.00 0.41 0.00 0.19 1.06 0.51 0.00 0.04 0.01 0.00 0.00 0 0.15 0.00 0.00 0.00 0.00 0.01 0.50 0.84 0.00 0.15 0.05 0.14 0.37 0.94 0.00 0.11 0.01 0.03 0.00 0.01 0.00 0.00 0.00 0.00 0 0 0 0 9.02 3.80 0.15
AGO 5.68 0.00 16.80 0.00 0.02 0.27 0 0.01 0.57 30.75 4.12 0.00 0.00 0 0 0 3.36 0.35 2.21 0.00 0.28 0.04 1.58 0.25 0.00 0.00 0.0 0.01 0.05 0.01 0.00 2.97 0.88 0.23 0.00 0.01 0.23 5.13 0.00 0.00 0.10 0 0.00 0.70 0.38 0.01 1.11 0.01 0.00 0.00 0.82 2.82 0.00 0.01 0.60 0.00 0.00 1.16 0.03 0.08 0.00 0.00 0.00 0 0.01 0.11 1.65 1.11 0.19 0.06 0.78 0.12 2.23 1.63 0.04 0.32 0.30 0.03 0.02 0.26 0.23 0.10 0.14 0.74 0.06 0.00 0.00 0.00 0 0 0 0 5.31 0.72 0.17
ALB 25.20 0.06 0.64 0.15 0.05 0.00 0 0.02 2.62 0.00 0.00 0.00 0.00 0 0 0 5.58 1.18 1.32 0.00 0.13 1.62 0.18 0.00 0.00 0.02 0.0 0.03 0.00 1.10 0.00 0.18 0.04 2.70 0.00 0.00 0.00 0.00 0.00 0.00 3.53 0 0.18 0.82 1.58 1.11 4.73 0.46 0.10 0.00 0.01 0.50 0.00 1.40 0.01 0.62 3.89 2.46 0.06 0.88 0.01 0.01 0.06 0 0.00 0.16 1.31 0.00 0.29 0.05 2.24 1.37 1.05 2.53 0.06 0.53 1.51 0.67 0.05 1.74 0.48 0.10 0.04 0.22 0.02 0.05 0.06 0.01 0 0 0 0 1.70 17.98 0.59
ARE 25.91 0.00 0.75 0.00 0.03 0.00 0 0.01 1.78 0.08 0.06 0.01 0.02 0 0 0 6.26 0.55 0.85 0.85 3.40 1.20 0.28 0.03 0.08 0.03 0.3 0.27 0.38 0.18 0.01 1.53 0.11 0.65 3.76 0.02 0.06 9.55 0.42 0.04 1.05 0 0.94 0.57 0.14 0.65 1.71 0.37 0.12 0.01 0.02 0.64 0.12 0.58 0.06 1.34 0.19 0.63 0.02 0.73 0.00 0.08 0.26 0 0.39 0.12 0.20 0.00 0.00 0.11 1.77 1.34 0.13 6.27 0.36 0.24 0.52 1.47 0.00 1.24 0.26 0.35 0.25 0.70 0.02 0.04 0.05 0.01 0 0 0 0 9.34 4.72 0.41
ARG 26.19 0.00 2.78 0.00 0.23 0.00 0 0.10 1.92 0.17 0.19 0.00 0.00 0 0 0 9.24 1.84 1.08 0.81 0.57 0.17 0.78 0.00 0.00 0.00 0.0 0.04 0.00 0.06 0.00 1.99 0.63 8.58 0.04 0.04 0.00 0.00 0.03 0.00 0.16 0 0.90 0.43 0.47 0.35 0.86 0.71 0.23 0.02 0.00 0.71 0.00 0.18 0.02 0.00 0.18 0.51 0.06 0.37 0.25 0.01 0.03 0 0.03 1.22 1.52 0.01 0.36 0.00 7.69 0.17 3.21 5.43 0.13 1.03 3.12 0.34 0.02 1.66 0.01 0.02 0.11 0.12 0.00 0.06 0.02 0.02 0 0 0 0 2.09 7.71 0.00

The product codes will probably look very abstract to the reader, but more detailed explanations can be found on the FAO website.

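To measure the dissimilarity between two countries, we use the Whittaker distance, which is half the sum of the absolute differences between the two countries' row profiles. With \(x_{ik}\) the supply of item \(k\) in country \(i\) and \(x_{i.}\) the total supply of country \(i\) over the \(K\) items: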

\(D_{ij} = \frac{1}{2}\sum_{k=1}^K |\frac{x_{ik}}{x_{i.}}-\frac{x_{jk}}{x_{j.}}|\)

The calculation is very easy with the dist.ldc() function of the adespatial package, which is optimized for large arrays (which is not the case here).

Info -- For this coefficient, sqrt(D) would be Euclidean
Extract of the dissimilarity matrix
AFG AGO ALB ARE ARG
AFG 0.000 0.719 0.521 0.370 0.526
AGO 0.719 0.000 0.737 0.658 0.680
ALB 0.521 0.737 0.000 0.382 0.345
ARE 0.370 0.658 0.382 0.000 0.367
ARG 0.526 0.680 0.345 0.367 0.000
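As an illustration of the Whittaker formula given above, here is a minimal Python sketch. It is independent of dist.ldc() and is only meant to show the arithmetic; the function names and the three-item toy countries "A", "B" and "C" are hypothetical and are not FAO data.

```python
def whittaker(x_i, x_j):
    """Whittaker dissimilarity between two vectors of non-negative amounts.

    Each vector is first turned into a profile (share of each item in the
    row total); the result is half the sum of absolute profile differences.
    """
    total_i, total_j = sum(x_i), sum(x_j)
    return 0.5 * sum(abs(a / total_i - b / total_j) for a, b in zip(x_i, x_j))


def dissimilarity_matrix(table):
    """Return a nested dict D[a][b] for every pair of rows of `table`."""
    codes = list(table)
    return {a: {b: whittaker(table[a], table[b]) for b in codes} for a in codes}


# Hypothetical supplies (Kcal per person per day) for three items.
toy = {
    "A": [50.0, 30.0, 20.0],
    "B": [20.0, 30.0, 50.0],
    "C": [100.0, 60.0, 40.0],  # twice the amounts of A, same profile
}
D = dissimilarity_matrix(toy)
print(round(D["A"]["B"], 3), round(D["A"]["C"], 3))
```

Countries "A" and "C" have different totals but identical profiles, so their dissimilarity is zero: the measure compares the structure of the diet, not its overall level. The value ranges from 0 (identical profiles) to 1 (no item in common).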

As in the case of the geopolitical network, the choice of the dissimilarity matrix is open to criticism because there are many possible options that could obviously produce different results. We could first choose another measure of dissimilarity (many of them are presented and discussed in Guénard and Legendre (2022)). But we could also decide to weight the 95 food items differently. Instead of Kcal per capita, we could simply have used kg per capita. Another option would be to focus on the proteins or fat present in each item. In these cases, the influence of some items on the dissimilarity would increase or decrease. Last but not least, the level of disaggregation of food items would modify the results (consider the case of oils …).

Classification

Having defined the weight matrix and the dissimilarity matrix, we can move on to classification, which is carried out using the constr.hclust() function from the adespatial package, whose instructions for use are nicely described in Guénard and Legendre (2022) with an example.
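The principle of a contiguity-constrained agglomeration can be sketched in a few lines of Python. This is not the constr.hclust() implementation: the linkage used for this note is not stated, so average linkage is assumed here purely for illustration, and the function name, the toy matrix `d` and the `chain` network are hypothetical.

```python
def constrained_clustering(d, links, k):
    """Agglomerate until k clusters remain, merging only linked clusters.

    d     : symmetric matrix (list of lists) of dissimilarities
    links : iterable of (i, j) index pairs, the edges of the proximity network
    k     : number of clusters at which to stop
    """
    clusters = [{i} for i in range(len(d))]
    edges = {frozenset(e) for e in links}

    def linked(a, b):
        # Two clusters are contiguous if at least one edge joins them.
        return any(frozenset((i, j)) in edges for i in a for j in b)

    def average(a, b):
        # Average linkage: mean dissimilarity over all cross pairs (assumed).
        return sum(d[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        candidates = [
            (average(a, b), p, q)
            for p, a in enumerate(clusters)
            for q, b in enumerate(clusters)
            if p < q and linked(a, b)
        ]
        if not candidates:  # disconnected network: nothing left to merge
            break
        _, p, q = min(candidates)
        clusters[p] |= clusters[q]
        del clusters[q]
    return sorted(sorted(c) for c in clusters)


# Toy example: units 0 and 3 are the most similar but are not neighbours.
d = [
    [0.0, 0.2, 0.8, 0.1],
    [0.2, 0.0, 0.9, 0.7],
    [0.8, 0.9, 0.0, 0.3],
    [0.1, 0.7, 0.3, 0.0],
]
chain = [(0, 1), (1, 2), (2, 3)]
all_pairs = [(i, j) for i in range(4) for j in range(i + 1, 4)]
print(constrained_clustering(d, chain, 2))      # with the contiguity constraint
print(constrained_clustering(d, all_pairs, 2))  # every pair allowed to merge
```

With the chain network, units 0 and 3 cannot be merged directly even though they are the most similar pair, so the partition differs from the one obtained when every pair is linked. The same dissimilarity matrix therefore yields different regions under different proximity networks.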

Tree

As in a classical hierarchical classification, we have to examine the tree before deciding on the number of regions we want to build.

The choice is not obvious, but we can start with a division into 6 classes.

Network visualization of regions

The adespatial package allows us to visualize the result in the form of a graph. The borders of the corresponding countries are superimposed on the graph.

One of the most interesting results of this example is the fact that the regions obtained do not necessarily follow the usual limits of continents, precisely because we have selected a Voronoï-Delaunay network that makes long-distance links possible. Two examples illustrate this point:

  • The region represented in gray associates Europe (including Russia), the USA, Canada, Chile, Argentina, Uruguay, Australia and New Zealand, creating a very good approximation of the so-called “Global North” or “West”. This result was possible only because of the long-distance links created by the Voronoï-Delaunay network across the Pacific through the … territory of French Polynesia. Had we eliminated this entity, such a region would not have come into existence!

  • The region represented in blue covers the majority of Muslim countries and associates Northern Africa with the Middle East, Central Asia and part of the Indian Ocean. A follower of Huntington would probably interpret this result through the lens of religion (e.g. the fact that alcohol is forbidden), but this is not entirely accurate because many other products contribute to the result, as we will see below. And as in the previous case, the result is strongly dependent on the geopolitical network adopted.

To summarize: this example illustrates that regionalizations of the world based on a criterion of homogeneity always result from the combination of two choices, the geopolitical network used as weight matrix and the dissimilarity measure. The big danger from a scientific point of view is to consider that only one factor is at stake when both are clearly combined.

Specificities of regions

We use the catdes() function from the FactoMineR package to analyze class profiles.

Specificities of the regions in terms of energy supply (measured in Kcal)
Region | Under-representation (z-score < -3) | Over-representation (z-score > +3)
Reg. 1 | Beer; Pigmeat | Wheat and products; Dates; Mutton & Goat Meat; Tomatoes and products; Onions; Nuts and products; Maize Germ Oil; Olives (including preserved)
Reg. 2 | Soyabean Oil; Lemons, Limes and products; Grapes and products (excl wine); Oats; Beer; Fruits, other; Sunflowerseed Oil; Cocoa Beans and products; Vegetables, other; Oilcrops Oil, Other; Beverages, Alcoholic; Sweeteners, Other; Nuts and products; Fats, Animals, Raw; Butter, Ghee; Pigmeat; Oranges, Mandarines; Apples and products; Potatoes and products; Poultry Meat; Milk - Excluding Butter; Eggs; Sugar (Raw Equivalent); Wheat and products | Groundnuts; Cassava and products; Sorghum and products; Groundnut Oil; Oilcrops, Other; Yams; Millet and products; Sesame seed; Beverages, Fermented; Plantains; Palm Oil; Palmkernel Oil; Pulses, Other and products
Reg. 3 | Pulses, Other and products; Cassava and products; Maize and products; Palm Oil; Rice and products | Milk - Excluding Butter; Wine; Beer; Pigmeat; Butter, Ghee; Potatoes and products; Honey; Fats, Animals, Raw; Sunflowerseed Oil; Apples and products; Cream; Cocoa Beans and products; Rye and products; Oilcrops Oil, Other; Oats; Eggs; Nuts and products; Wheat and products; Sweeteners, Other; Rape and Mustard Oil; Grapes and products (excl wine); Olive Oil; Bovine Meat; Demersal Fish; Beverages, Alcoholic; Sugar (Raw Equivalent); Coffee and products; Tomatoes and products; Oranges, Mandarines
Reg. 4 | Sunflowerseed Oil; Nuts and products | Pineapples and products; Poultry Meat; Soyabean Oil; Oranges, Mandarines; Lemons, Limes and products; Infant food; Sugar (Raw Equivalent); Grapefruit and products; Fruits, other; Maize and products
Reg. 5 | Tomatoes and products; Wheat and products | Rice and products; Freshwater Fish; Ricebran Oil; Sugar cane; Pimento; Palmkernel Oil
Reg. 6 | Vegetables, other; Poultry Meat; Eggs | Maize and products; Beverages, Fermented
Reg. 7 | Milk - Excluding Butter | Coconuts - Incl Copra; Coconut Oil; Aquatic Animals, Others; Molluscs, Other; Marine Fish, Other; Roots, Other; Aquatic Plants; Pelagic Fish; Pigmeat; Sweet potatoes; Soyabeans; Miscellaneous; Poultry Meat; Vegetables, other; Cephalopods; Crustaceans
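To give an idea of how a z-score of this kind can be obtained, here is a minimal Python sketch of a test value that compares the mean of a variable within a class to its overall mean. It is an assumption about the general form of such a statistic, not a restatement of what catdes() computes: conventions such as the variance divisor may differ, and the function name and toy data are hypothetical.

```python
from math import sqrt


def class_z_scores(values, labels):
    """Return {class: z-score} comparing each class mean to the overall mean.

    The standard error is that of the mean of n_g units drawn without
    replacement from the n units, using the overall variance with
    divisor n (an assumed convention).
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    scores = {}
    for g in sorted(set(labels)):
        members = [v for v, lab in zip(values, labels) if lab == g]
        n_g = len(members)
        se = sqrt(var / n_g * (n - n_g) / (n - 1))
        # A constant variable or a class covering every unit gives se == 0.
        scores[g] = (sum(members) / n_g - mean) / se if se > 0 else 0.0
    return scores


# Hypothetical share (%) of one food item in six countries, two regions.
shares = [10.0, 12.0, 11.0, 1.0, 2.0, 3.0]
regions = ["a", "a", "a", "b", "b", "b"]
z = class_z_scores(shares, regions)
print({g: round(s, 2) for g, s in z.items()})
```

A positive score means the item is over-represented in the class relative to the whole set of countries, and a negative score means it is under-represented. With only six units the toy scores stay below the ±3 threshold used in the table above.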

Final map

References

Guénard, Guillaume, and Pierre Legendre. 2022. “Hierarchical Clustering with Contiguity Constraint in R.” Journal of Statistical Software 103 (7): 1–26. https://www.jstatsoft.org/article/view/v103i07.